Česílko – an MT system for closely related languages

Author

Jan HRIC

Abstract

The demonstration of our system addresses one very important part of the translation business: the localization of texts and programs from one source language into a group of mutually related target languages. It shows, step by step, a simple method of machine translation between related languages and its incorporation into an existing commercial translation aid that uses the concept of translation memory. Localizing the same source into several typologically similar target languages, one language pair after another, is clearly a waste of money and effort, because very similar problems must be solved for each source-target language pair. Using one language from the target group as a pivot and performing the translation and localization through it seems to be a quite natural solution to these problems. It is of course much easier to translate texts from Czech to Polish or from Russian to Bulgarian than from English or German to any of these languages.

Introduction

As part of our “pivot” language solution, we combine an MT system with a commercial machine-aided translation (MAT) system. We use the TRADOS system, although any such system will do. The system uses the concept of translation memory, which contains pairs of previously translated sentences in a source and a target language. When a human translator starts translating a new sentence, the system tries to match the source (with a degree of similarity set by the user) against the sentences already stored in the translation memory. If a match is found, the human translator decides whether to use it, modify it or reject it.

1 Translation Memory Integration

The segmentation of the translation memory (the texts are stored as corresponding pairs of source/target language sentences) is the key feature of our method. The translation memory may be exported into a text file, which makes its content easy to manipulate.
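The lookup step described in the introduction can be illustrated with a minimal sketch. It is not how TRADOS computes similarity: it assumes a hypothetical tab-separated export (one source/target pair per line) and uses the ratio from Python's `difflib.SequenceMatcher` as a stand-in similarity measure, with `threshold` playing the role of the user-set degree of similarity.

```python
from difflib import SequenceMatcher


def parse_export(text):
    """Parse a hypothetical tab-separated export into (source, target) pairs."""
    pairs = []
    for line in text.splitlines():
        if "\t" not in line:
            continue  # skip blank or malformed lines
        source, target = line.split("\t", 1)
        pairs.append((source.strip(), target.strip()))
    return pairs


def best_match(sentence, memory, threshold=0.75):
    """Return (score, source, target) for the most similar stored segment,
    or None if no stored segment reaches the threshold."""
    best = None
    for source, target in memory:
        # Stand-in similarity measure (an assumption, not the TRADOS metric).
        score = SequenceMatcher(None, sentence.lower(), source.lower()).ratio()
        if score >= threshold and (best is None or score > best[0]):
            best = (score, source, target)
    return best


# Illustrative English/Czech pairs, not data from the paper.
EXPORT = "Open the file.\tOtevřete soubor.\nSave the file.\tUložte soubor.\n"
memory = parse_export(EXPORT)

exact = best_match("Open the file.", memory)       # identical segment
fuzzy = best_match("Open the files.", memory)      # near match
miss = best_match("Print the document.", memory)   # nothing similar enough
```

In this sketch the translator would be offered the stored Czech segment for both the identical and the near-identical sentence, and nothing for the unrelated one; deciding whether to use, modify or reject the suggestion stays with the human.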
Let us suppose that we have at our disposal two translation memories: one human-made for the source/pivot language pair and the other created by an MT system for the pivot/target language pair. Substituting the segments of the pivot language with the segments of the target language is then only a routine procedure. The human translator translating from the source to the target language then gets a translation memory for the required pair (source/target); no trace of the pivot language is left. The system of penalties applied in TRADOS Translator’s Workbench guarantees that a previous human-made translation present in the memory gets higher priority than the automatic translation.

This method has at least three advantages:

– Using the machine-made translation memory only as a resource supporting direct human translation from the source to the target language has no negative effect on the quality of translation.
– From the user’s point of view, there is no difference (except for the small difference in the quality of the translation memories) between our method and the original process of working with solely human-made translation memories.
– Given a sufficient quality of the MT from the pivot to the target language, our method may substantially increase the speed and reduce the costs of translation from the source to the target languages.

2 The System ČESÍLKO

The system ...
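The substitution procedure of Section 1 can be sketched as a join on the pivot segment. This is a minimal illustration, not the authors' implementation: the data layout (lists of segment pairs), the `origin` labels and the numeric penalty values are assumptions made for the example, and the actual penalty mechanism of TRADOS Translator’s Workbench is not modelled.

```python
# Assumed penalty values (hypothetical); a lower penalty means higher priority.
PENALTY = {"human": 0, "mt": 15}


def compose(source_pivot, pivot_target):
    """Replace each pivot segment with its MT-made target segment, giving
    (source, target, origin) entries that no longer mention the pivot."""
    pivot_to_target = dict(pivot_target)
    return [
        (source, pivot_to_target[pivot], "mt")
        for source, pivot in source_pivot
        if pivot in pivot_to_target  # skip segments the MT memory lacks
    ]


def merge(human_entries, mt_entries):
    """Keep one entry per source segment, preferring the lowest penalty."""
    best = {}
    for source, target, origin in list(human_entries) + list(mt_entries):
        current = best.get(source)
        if current is None or PENALTY[origin] < PENALTY[current[1]]:
            best[source] = (target, origin)
    return best


# Illustrative English -> Czech (pivot) -> Slovak (target) data, not from the paper.
en_cs = [("Open the file.", "Otevřete soubor."),
         ("Save the file.", "Uložte soubor.")]
cs_sk = [("Otevřete soubor.", "Otvorte súbor."),
         ("Uložte soubor.", "Uložte súbor.")]
human_en_sk = [("Save the file.", "Uložte súbor na disk.", "human")]

memory = merge(human_en_sk, compose(en_cs, cs_sk))
```

Here the entry for "Save the file." keeps the human-made translation, while "Open the file." is filled from the machine-made memory, mirroring the priority rule described in Section 1.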

Similar resources

Tagging as a Key to Successful MT

This paper describes the key role of a stochastic morphological tagger in an MT system between very closely related languages. The MT system Česílko exploits the close relatedness of both natural languages in question (Czech and Slovak), which allows substantial simplification of the translation method used. It also uses to a great advantage the possibilities of combination of a human translati...

A Method of Hybrid MT for Related Languages (Control and Cybernetics)

The paper introduces a hybrid approach to a very specific field in machine translation — the translation of closely related languages. It mentions previous experiments performed for closely related Scandinavian, Slavic, Turkic and Romanic languages and describes a novel method, a combination of a simple shallow parser of the source language (Czech) combined with a stochastic ranker of (parts of...

Translating from under-resourced languages: comparing direct transfer against pivot translation

In this paper we compare two methods for translating into English from languages for which few MT resources have been developed (e.g. Ukrainian). The first method involves direct transfer using an MT system that is available for this language pair. The second method involves translation via a cognate language, which has more translation resources and one or more advanced translation systems (e....

A Comparison of MT Methods for Closely Related Languages: a Case Study on Czech - Slovak Language Pair

This paper describes an experiment comparing results of machine translation between two closely related languages, Czech and Slovak. The comparison is performed by means of two MT systems, one representing rule-based approach, the other one representing statistical approach to the task. Both sets of results are manually evaluated by native speakers of the target language. The results are discus...

Structural Similarities in MT A Bulgarian-Polish case

This paper shows that although it seems relatively easy to translate between closely related languages, not every framework manages to capture important details in the argument structure. By combining methods tested for translation between Swedish and Norwegian and assuming a compact theory of argument structure, I think that we can achieve better results in an MT system that deals with Slavic ...

Testing the Limits - Adding a New Language to an MT System

This paper deals with a problem of an application of an MT method developed for a pair of very closely related languages to a pair of languages whose degree of relatedness (and thus also the degree of similarity) is lower. The close relatedness of the original language pair (Czech and Slovak) allowed a substantial simplification of the translation method used. This paper provides an overview of...

Publication date: 2000
